76        Bioinformatics

Add the following to the end of the file:

export PATH="your_path/bowtie2":$PATH

Do not forget to replace “your_path” with the actual path on your computer. Save the file and exit. You may need to restart the terminal or run “source ~/.bashrc” for the change to take effect. Then enter “bowtie2” in the terminal. If Bowtie2 is installed and its path is set correctly, the help screen will be displayed.
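The manual edit described above can also be scripted. The following is a minimal sketch, assuming a Bash-compatible shell; the helper name add_to_path_once is hypothetical, and “your_path” remains a placeholder that must be replaced with the real location. It appends the export line only when that exact line is not already in the file, so running it twice does not duplicate the entry.

```shell
# Hypothetical helper (not part of Bowtie2): append an export line for a
# directory to a startup file only if that exact line is not already there.
add_to_path_once() {
    dir="$1"       # directory holding the bowtie2 executables
    rcfile="$2"    # startup file, e.g. "$HOME/.bashrc"
    line="export PATH=\"$dir\":\$PATH"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}

# Example usage (placeholder path; replace it with the real location):
#   add_to_path_once "your_path/bowtie2" "$HOME/.bashrc"
#   . "$HOME/.bashrc"
#   command -v bowtie2   # prints the resolved path if the setup worked
```

Using “command -v bowtie2” after reloading the file is a quick way to confirm which executable the shell will actually run.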

Before read mapping, we need to use the “bowtie2-build” command to index the FASTA sequence of the reference genome. Enter “bowtie2-build” on the command line of the terminal to display the help screen, which shows the usage and options. The general syntax is as follows:

bowtie2-build [options] <reference_in> <ebwt_outfile_base>

The “bowtie2-build” command requires a FASTA file of the reference genome as input and a string that is used as the prefix of the index file names. The following command indexes the human genome for Bowtie2. Before running the command, make sure that the current working directory is the parent of the “refgenome” directory, where we downloaded the human genome.

bowtie2-build \
    --threads 4 \
    refgenome/GRCh38.p13_ref.fna \
    refgenome/bowtie2

Indexing may take around 25 minutes using four threads on a computer with 32 GB of memory. The “bowtie2-build” command generates six index files whose names begin with the prefix provided to the command. Pre-built indexes for some organisms can also be downloaded from the official Bowtie2 website.
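Because indexing is slow, it can be worth confirming that all six files were written before starting an alignment. The sketch below assumes the standard small-index file names (“.1.bt2” to “.4.bt2”, “.rev.1.bt2”, and “.rev.2.bt2”); very large references produce “.bt2l” files instead, which this check does not cover. The function name bowtie2_index_complete is hypothetical.

```shell
# Hypothetical helper (not part of Bowtie2): succeed only if all six
# small-index files exist for the given prefix.
bowtie2_index_complete() {
    prefix="$1"
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -f "${prefix}.${suffix}.bt2" ]; then
            echo "missing: ${prefix}.${suffix}.bt2" >&2
            return 1
        fi
    done
    return 0
}

# Example usage with the prefix from the indexing command:
#   bowtie2_index_complete refgenome/bowtie2 && echo "index looks complete"
```

The check only tests that the files are present; it does not validate their contents.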

After indexing the reference genome, we can use the “bowtie2” command to align the paired-end reads and generate a SAM file:

bowtie2 -x refgenome/bowtie2 \
    -1 data/SRR769545_1.fastq.gz \
    -2 data/SRR769545_2.fastq.gz \
    -S sam/SRR769545_bowtie2.sam
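The alignment command assumes that both mate files are present and that the “sam” output directory already exists. A small pre-flight check, sketched below, can catch either problem before a long run starts. The function name check_paired_inputs is hypothetical, and the “_1.fastq.gz” and “_2.fastq.gz” naming is taken from the example files above rather than from any Bowtie2 requirement.

```shell
# Hypothetical helper (not part of Bowtie2): verify that both mate files
# exist for a sample, then make sure the output directory is present.
check_paired_inputs() {
    sample="$1"    # e.g. SRR769545
    datadir="$2"   # directory with the FASTQ files, e.g. data
    outdir="$3"    # directory for the SAM output, e.g. sam
    for mate in 1 2; do
        if [ ! -f "${datadir}/${sample}_${mate}.fastq.gz" ]; then
            echo "missing: ${datadir}/${sample}_${mate}.fastq.gz" >&2
            return 1
        fi
    done
    mkdir -p "$outdir"
}

# Example usage before the alignment:
#   check_paired_inputs SRR769545 data sam && \
#       bowtie2 -x refgenome/bowtie2 \
#           -1 data/SRR769545_1.fastq.gz \
#           -2 data/SRR769545_2.fastq.gz \
#           -S sam/SRR769545_bowtie2.sam
```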

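The “-S” option names the SAM file to write; without it, Bowtie2 prints the SAM records to standard output. Bowtie2 does not write BAM directly (its “-b” option declares that the input reads are in unaligned BAM format), so to obtain a BAM file, pipe the output into a tool such as “samtools view -b”. To learn more about Bowtie2’s options, enter “bowtie2” on the command line of the Linux terminal.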
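One common way to obtain a BAM file from Bowtie2 is to omit “-S”, so that the SAM records go to standard output, and pipe them into “samtools view”. The following is a minimal sketch under the assumption that both bowtie2 and samtools are on the PATH; the wrapper name align_to_bam and the output file name in the usage comment are hypothetical.

```shell
# Hypothetical wrapper: align paired-end reads and store the result as BAM.
# Assumes bowtie2 and samtools are both available on the PATH.
align_to_bam() {
    index="$1"     # index prefix, e.g. refgenome/bowtie2
    r1="$2"        # FASTQ file with the first mates
    r2="$3"        # FASTQ file with the second mates
    out_bam="$4"   # BAM file to create
    # Without -S, bowtie2 writes SAM to standard output; samtools reads it
    # from standard input ("-") and writes BAM (-b) to the file given by -o.
    bowtie2 -x "$index" -1 "$r1" -2 "$r2" | samtools view -b -o "$out_bam" -
}

# Example usage with the files from this section:
#   align_to_bam refgenome/bowtie2 \
#       data/SRR769545_1.fastq.gz data/SRR769545_2.fastq.gz \
#       SRR769545_bowtie2.bam
```

Piping avoids writing the much larger intermediate SAM file to disk. Note that the exit status of a plain pipeline reflects only its last command, so a failure in bowtie2 itself should be checked in its messages on standard error.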

2.3.2.3  STAR

STAR, which stands for Spliced Transcripts Alignment to a Reference, is a fast read aligner developed to handle the alignment of massive numbers of RNA-Seq reads. Its alignment